ABSTRACT

An investigation was undertaken of the 
intercorrelations among residuals obtained by applying regression 
techniques under two conditions: (1) mean sixth-grade Metropolitan 
Achievement Test reading scores predicted by mean second-grade 
Stanford Achievement Test reading scores, and (2) mean sixth-grade
reading scores predicted by mean second-grade reading scores and 
mobility variables. The predictions were performed on data for 66 
elementary schools for three consecutive years. The addition of
mobility variables significantly improved the prediction model.
Intercorrelations among residuals within year and across models were 
uniformly high (.77 to .94). Intercorrelations across years were not
improved by the addition of mobility variables (.46 to .51).
(Author) 

Some Empirical Evidence on the Stability of Discrepancy
Measures Based on Observed and Predicted School Means
on Achievement Tests.

Joseph F. Gastright 
Cincinnati Public Schools

The conceptualization of school systems as input/output systems has
led to the application of various mathematical models for predicting
school outputs. Dyer (1966) proposed the use of multiple linear
regression techniques for obtaining discrepancy measures based on observed
and predicted school system achievement means. Dyer, Linn, and Patton (1969)
presented empirical evidence on the comparability of discrepancy measures
obtained using four different methods.

Method I utilized the regression of individual student output
achievement on student input achievement using a student sample identical
at two grade levels (matched-longitudinal sample). Method II utilized the
regression of mean school system output on mean school system input for the
same matched-longitudinal sample. Method III utilized the regression of mean
school system output on mean school system input for all students
available at those points in time (unmatched-longitudinal sample). Method IV
utilized the regression of mean school system output on the concurrent school
system mean of the earlier grade level (cross-sectional sample).

The "Student-Change Model" of an Educational System proposed by
Dyer ([illegible]) included among the independent variables not only input
achievement but also measures of the hard-to-change conditions (e.g., students'
socioeconomic status, community wealth) and educational process variables
(e.g., variety of teaching methods). Data on the surrounding condition and
process variables were not available, however, to Dyer and his associates
in the data base of the New York study.

Dyer et al., using fifth grade achievement scores to predict eighth
grade achievement scores (Iowa Test of Basic Skills), concluded that
Methods I and II were essentially interchangeable, but not comparable
to Methods III and IV. Operating under the assumption that the methods
utilizing matched student samples were intrinsically superior to the
others, they concluded that Methods III and IV did not produce residuals
which were sufficiently comparable to those from Methods I and II to
serve as reasonable substitutes for them. The stability of the residuals
produced by Methods I and II was estimated by randomly dividing the
student population into two groups and correlating the residuals produced
by these two independent estimates. They reported a median correlation
across the subtests of .70. This procedure randomizes the various
factors that have contributed to differential educational programs and gives
an estimate of the amount of error associated with pupils.
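
The random-halves procedure described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the computation performed by Dyer and his associates: the function names (`school_residuals`, `split_half_stability`), the number of schools and pupils, and every parameter used to generate the scores are assumptions made for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def school_residuals(inputs, outputs):
    """Residuals from regressing school mean output on school mean input."""
    mx, my = mean(inputs), mean(outputs)
    slope = (sum((a - mx) * (b - my) for a, b in zip(inputs, outputs))
             / sum((a - mx) ** 2 for a in inputs))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(inputs, outputs)]

def split_half_stability(schools, rng):
    """Correlate school residuals from two random halves of each school's pupils.

    `schools` is a list of schools; each school is a list of
    (input_score, output_score) pairs for matched pupils.
    """
    halves = [([], []), ([], [])]  # (input means, output means) for each half
    for pupils in schools:
        shuffled = list(pupils)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        for half, group in zip(halves, (shuffled[:mid], shuffled[mid:])):
            half[0].append(mean([p[0] for p in group]))
            half[1].append(mean([p[1] for p in group]))
    res_a = school_residuals(*halves[0])
    res_b = school_residuals(*halves[1])
    return pearson(res_a, res_b)

# Synthetic data (hypothetical): 30 schools with 40 matched pupils each.
rng = random.Random(0)
schools = []
for _ in range(30):
    level = rng.gauss(5.0, 0.5)   # school's typical input score
    effect = rng.gauss(0.0, 0.3)  # stable school effect on output
    pupils = []
    for _ in range(40):
        score_in = level + rng.gauss(0.0, 0.8)
        score_out = 3.0 + 0.9 * score_in + effect + rng.gauss(0.0, 0.8)
        pupils.append((score_in, score_out))
    schools.append(pupils)

print(round(split_half_stability(schools, rng), 2))
```

Because both halves share the same school and program, the correlation obtained this way reflects mainly pupil sampling error, which is the sense in which the procedure estimates error associated with pupils.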

Forsyth (1973) provided some evidence on a different kind of
stability, the consistency of residuals for consecutive classes in the same
school. He randomly sampled 50 students from each of 3[illegible]0 schools in
Iowa and, utilizing Dyer's Method II, predicted mean school twelfth grade
achievement scores using mean school ninth grade achievement scores
(Iowa Test of Basic Skills). The multiple correlation coefficients
reported by Forsyth are very consistent with those reported by Dyer.
However, the intercorrelations between residuals for the consecutive years
(median r = .20) were considerably lower than the random halves correlations
reported by Dyer.

Acland (1972) reported somewhat higher intercorrelations among
residuals (r = [illegible]) for consecutive classes using unmatched student
groups (Dyer's Method III). Using longitudinal achievement test data from
grades one to six (Metropolitan Achievement Test), Acland found that the
intercorrelation between residuals for consecutive classes increased as the
interval between the testing dates increased from one to three years. Acland
interpreted the correlation between residuals as "a measure of the
percentage of the variation in the residual scores that can be attributed to the
stable characteristics of schools that raise performance level."
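
One way to read this interpretation is sketched below. The decomposition is an assumption introduced here for illustration and is not taken from Acland: the residual for school j in year t is written as a stable school component s_j plus a cohort-specific component e_jt, with the components uncorrelated and the variance of e the same in both years.

```latex
r_{jt} = s_j + e_{jt}, \qquad
\operatorname{Cov}(s_j, e_{jt}) = 0, \qquad
\operatorname{Cov}(e_{j1}, e_{j2}) = 0

\operatorname{Corr}(r_{j1}, r_{j2})
  = \frac{\operatorname{Cov}(s_j + e_{j1},\; s_j + e_{j2})}
         {\sqrt{\operatorname{Var}(r_{j1})\,\operatorname{Var}(r_{j2})}}
  = \frac{\sigma_s^2}{\sigma_s^2 + \sigma_e^2}
```

Under these assumptions the year-to-year correlation equals the share of residual variance carried by the stable component, so a correlation of .50 would correspond to half of the residual variance.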

This study investigates the intercorrelations among residuals for
three consecutive years using unmatched longitudinal student samples
(Dyer's Method III). The results were obtained under two conditions:
1) mean school sixth grade reading achievement predicted by mean school
second grade reading achievement, and 2) mean school sixth grade reading
achievement predicted by mean school second grade reading achievement and
school and community background variables collected concurrently with the
second grade achievement testing. The multiple regressions and
intercorrelations among residuals for three consecutive years were compared to
determine whether 1) the addition of background variables significantly
increased the predictive accuracy of the regression model, and 2) these
factors increased the year to year stability of residuals obtained using
the regression model.
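
The two conditions can be illustrated with a small sketch. It is not the analysis run for this study: the data are synthetic, the generating coefficients are arbitrary, and the helper names (`solve`, `ols`, `run_condition`) are hypothetical. Percent transfers in and percent absence stand in for the background variables only because they appear in the tables.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def run_condition(rows, y):
    """Return (multiple R, residuals) for one set of predictors."""
    coef = ols(rows, y)
    fitted = [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in rows]
    return pearson(y, fitted), [obs - fit for obs, fit in zip(y, fitted)]

# Synthetic school means (hypothetical values, 64 schools).
rng = random.Random(1)
grade2 = [rng.gauss(2.5, 0.4) for _ in range(64)]
transfers = [rng.uniform(5.0, 40.0) for _ in range(64)]
absence = [rng.uniform(3.0, 15.0) for _ in range(64)]
grade6 = [2.5 + 1.2 * g - 0.02 * t - 0.05 * a + rng.gauss(0.0, 0.3)
          for g, t, a in zip(grade2, transfers, absence)]

# Condition 1: achievement alone. Condition 2: achievement plus background.
r1, resid1 = run_condition([[g] for g in grade2], grade6)
r2, resid2 = run_condition([[g, t, a] for g, t, a in zip(grade2, transfers, absence)],
                           grade6)
print(round(r1, 2), round(r2, 2), round(pearson(resid1, resid2), 2))
```

The last value printed is the within-year, across-model correlation of residuals, the kind of quantity later reported in Table 5.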
Procedures

Longitudinal school mean data are available on 64 elementary schools
in the Cincinnati Public School System. Second grade Stanford Achievement
Test means in reading comprehension and sixth grade Metropolitan
Achievement Test means in paragraph meaning were available for all schools for
three consecutive years (1967-71, 1968-72, 1969-73). The School Information
System, a computerized data bank, routinely collects and reports a variety
of data on schools in the system. The mobility and background variables
were utilized to develop the regression equations for this study.

Intercorrelations between mobility, background variables, and mean
school reading achievement were used to limit the number of variables
under consideration to fifteen. Stepwise multiple regressions were then
run to identify variables which contributed significantly to the
prediction. To be included in the prediction equation a variable had to
increase the squared multiple correlation significantly (p<.05).
Because the final regression model was to be used to produce an
interpretive report on school achievement for the benefit of principals and
school system decision makers, certain non-statistical criteria were applied.
The model building procedures specified by Draper and Smith (1966) were
followed with the additional stipulation that the variables included in
the final equation should have an educationally plausible relationship
with reading achievement. Since the regression equations were to be used
for predicting future achievement, not simply fitting the data, the
variables entered had to show stability over time.
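
The entry criterion (a significant increase in the squared multiple correlation) corresponds to the usual F test for an increment in R squared. The sketch below applies that test to the rounded multiple R values printed in Table 2. It rests on several assumptions: N = 64 schools is taken from the Procedures section (the abstract cites 66), the critical value 4.00 is the approximate .05 point of F with 1 and 60 degrees of freedom from a standard table, and the function name is hypothetical. It is not a reconstruction of the stepwise program actually used.

```python
def f_for_increment(r_reduced, r_full, n_schools, k_full, added=1):
    """F statistic for the gain in R squared when `added` predictors enter.

    k_full is the number of predictors in the larger model.
    """
    r2_reduced, r2_full = r_reduced ** 2, r_full ** 2
    df_error = n_schools - k_full - 1
    return ((r2_full - r2_reduced) / added) / ((1.0 - r2_full) / df_error)

N_SCHOOLS = 64     # assumption: the count given under Procedures
F_CRITICAL = 4.00  # approximate .05 critical value, F(1, 60), from a table

# (variable entered, multiple R before, multiple R after, predictors after)
steps = [
    ("Percent on Welfare", 0.79, 0.88, 2),
    ("Percent Transfers In", 0.88, 0.90, 3),
    ("Pupil/Parent Factor", 0.90, 0.90, 4),
]
for name, r_before, r_after, k in steps:
    f_value = f_for_increment(r_before, r_after, N_SCHOOLS, k)
    decision = "enters" if f_value > F_CRITICAL else "does not enter"
    print(name, round(f_value, 2), decision)
```

Because the tabled R values are rounded to two places, the F values are approximate; the pattern of entries agrees with the asterisks in Table 2.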

Analysis of the initial regression equations revealed two major
problems. First, if all achievement test subtests were admitted to the
independent variables, then no background or mobility variables
contributed significantly to the prediction of the reading output scores.
Furthermore, the pattern of significant achievement predictors over the three
years was not stable. In subsequent analysis the achievement input was
limited to the corresponding reading subtest. Second, when the
achievement input was restricted, the mobility and background variables which
contributed statistically to the prediction were not identical over the
three years. The non-achievement variables, like the achievement subtests,
tended to be highly intercorrelated. A number of different models
were investigated (see Tables 1-4) for stability. The elimination of
variables was based on both empirical and rational reasons.

The residuals obtained with various models were correlated within
each year and across the three years. The regression equation selected
for the final report (Three Variable Model 1971/72) was applied to the
data from the other two years. Correlations were then run between these
three sets of residuals to estimate the stability of a single model when
applied to a different data base.
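
Applying one fixed equation to several years of data can be sketched as below. The coefficients in `BASE_COEF` are placeholders, not the coefficients estimated in this study, and the school data are synthetic; `apply_equation` is a hypothetical helper name.

```python
import random

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def apply_equation(coef, rows, observed):
    """Residuals (observed minus predicted) for a fixed equation.

    coef[0] is the intercept; the remaining entries match the columns of rows.
    """
    return [obs - (coef[0] + sum(c * v for c, v in zip(coef[1:], row)))
            for row, obs in zip(rows, observed)]

# Placeholder coefficients: intercept, second grade reading,
# percent transfers in, percent absence.
BASE_COEF = [3.0, 1.1, -0.02, -0.04]

rng = random.Random(7)
n_schools = 64
stable_effect = [rng.gauss(0.0, 0.3) for _ in range(n_schools)]

residuals_by_year = {}
for year in ("70/71", "71/72", "72/73"):
    rows = [[rng.gauss(2.5, 0.4), rng.uniform(5.0, 40.0), rng.uniform(3.0, 15.0)]
            for _ in range(n_schools)]
    observed = [BASE_COEF[0] + sum(c * v for c, v in zip(BASE_COEF[1:], row))
                + effect + rng.gauss(0.0, 0.3)   # cohort-specific noise
                for row, effect in zip(rows, stable_effect)]
    residuals_by_year[year] = apply_equation(BASE_COEF, rows, observed)

for a, b in (("70/71", "71/72"), ("70/71", "72/73"), ("71/72", "72/73")):
    print(a, b, round(pearson(residuals_by_year[a], residuals_by_year[b]), 2))
```

With a fixed equation the residuals in the other years need not average zero, which is what distinguishes this use from refitting the regression to each year separately.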
Results

The multiple correlation coefficients for the various models across
all three years are between .85 and .93. The data reported for a single
year (Tables 1-4) are representative of the results for the other two
years. The variables included in the different models are the same across
the three years. The regression coefficients are, of course, unique to
the data within each year. When achievement test inputs are restricted,
the mobility and background variables add significantly to the prediction
of output reading achievement. The number of non-achievement variables
contributing significantly to the prediction varied from one to four
over the three years.

The correlations of residuals within year and across models were in
the range .70 to .95 (Table 5). In general, the addition of variables
lowers the correlation of the residuals with those based on achievement
alone.

The correlations between residuals within the same model but across
the three years (Table 6) ranged from .25 to .56. There is a systematic
reduction in the correlation between residuals as the number of variables
admitted to the model increases.




The three variable model for 1971/72 was applied to the input data
for the other two years. The intercorrelations between residuals for the
three years (Table 7) are fairly consistent and not markedly different
from those obtained using the regression equations descriptively (Table 6).
Discussion

The results of this study show that background and mobility variables
can add significantly to the accuracy of multiple regression predictions of
school output, when the number of achievement test inputs is restricted.
The non-achievement variables available for this study were all moderately
or highly intercorrelated, and their pattern of entry into the stepwise
multiple regressions was not stable over the three years studied. The two
non-achievement variables which showed the most stable behavior in our study
were rather commonly reported school system descriptors (percent absence,
percent transfers).

Since the residuals derived from the multiple regression analysis are
to be interpreted as estimates of a school's performance, the question of
selecting the appropriate model becomes a potentially explosive one. This
raises an important question: to what extent are the residuals obtained
within a given year affected by the selection of different groups of input
variables? The correlations in Table 5 suggest that as the regression model
is over-fitted the correlation of the residuals with those obtained from
achievement input alone decreases. In absolute terms this can mean that a
given school's actual mean achievement in reading can be described as
significantly below expectation or moderately above expectation based solely
on the choice of the regression model. Admittedly, these cases are not
common but they do occur in our data.

The non-achievement variables used in our analyses were in many cases
highly accurate (e.g., head counts) but not necessarily well behaved, in the
sense of being normally distributed. This suggests that reliance on
statistical probabilities alone in the selection of regression models would be
unwise, particularly for models based on a single year's data. The model
verification procedures suggested by Draper and Smith proved useful in
eliminating potentially troublesome input variables. Two types of problems
should be investigated carefully. First, some variables are subject to
redefinition by policy decision or change in data collection. Second, some
variables have spectacular "outliers" which can lead to spurious predicted
outputs.

The correlations of residuals within model but across the three years
provide no evidence that the addition of non-achievement variables increases
the stability of residuals across years. On the contrary, the results
indicate that the residuals based on achievement input alone are more highly
correlated than those obtained from the other models. The over-fitted models
(those containing marginally significant input variables) are significantly
less stable.

The consecutive year stabilities are much higher than those reported by
Forsyth and marginally higher than those reported by Acland. The results
obtained by applying a single regression equation to the data from the three
years (Table 7) show highly consistent correlations among the residuals.
Of course, correlations among residuals do not indicate the absolute magnitude
of the changes in either actual output, gain scores from second to sixth
grade, or even differences between residuals.

The median differences in residuals for the same school for consecutive
years were in the order of .5 years. These results and the variation in
gain scores for the same schools over the same period indicate that there
is substantial unexplained variation in school performance associated with
a particular cohort. Acland found similar results in the New York Study.
Since this variability cannot be associated with corresponding changes in
the socioeconomic status of the parents in the school, it suggests that
measurement of cohort parameters associated with a particular grade level, within
a particular school (e.g., group values, leadership, etc.), might account for
a significant portion of school effectiveness. Since these variables are
plausibly within the range of school influence, such an investigation might
be of considerable educational importance.

Estimates of stable school effects in the present study are comparable
to those reported by Acland. While these estimates are smaller than some
educators and sociologists would have predicted, this does not eliminate the
possibility that changes in school program can increase the impact of
education. As many have pointed out, the alternate hypothesis has still to be
refuted: the hypothesis that on important parameters schools are very much
alike. The development and use of a baseline model for educational output
would potentially allow all schools to climb above their previous
effectiveness. The constant reapplication of regression analysis to each year's
data leads to the old "zero-sum" game where half of the schools are "below
prediction".
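
The "zero-sum" point follows from a property of least squares: when an equation with an intercept is refitted to a year's data, the residuals for that year sum to zero, so schools cannot all be above prediction. A minimal sketch with made-up school means (all values hypothetical) contrasts refitting with a fixed baseline equation when every school gains half a year:

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def residuals(line, x, y):
    """Observed minus predicted for a given (intercept, slope)."""
    intercept, slope = line
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical school means in grade equivalents.
second_grade = [1.8, 2.1, 2.4, 2.6, 2.9, 3.2]
sixth_year1 = [4.9, 5.6, 5.5, 6.3, 6.4, 7.2]
sixth_year2 = [v + 0.5 for v in sixth_year1]  # every school improves by 0.5

baseline = fit_line(second_grade, sixth_year1)  # fixed baseline model
refit = fit_line(second_grade, sixth_year2)     # model refitted to year 2

refit_residuals = residuals(refit, second_grade, sixth_year2)
baseline_residuals = residuals(baseline, second_grade, sixth_year2)

print([round(r, 2) for r in refit_residuals])     # these sum to zero
print([round(r, 2) for r in baseline_residuals])  # each is shifted up by 0.5
```

In this example the refitted intercept absorbs the uniform gain, so the improvement is invisible in the refitted residuals, while the baseline equation credits it to every school.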

The use of multiple regression methods for investigating school system
performance or school performance is still a fairly untried, or at least
unreported, activity. The results of existing empirical studies, differing
as they do in grade levels, achievement tests, and methodology, provide a
fairly modest base for generalizing on the utility of the method. On the
one hand, the method has proved particularly useful as a way of dealing with
the large amounts of "messy data" collected about schools. Perhaps the most
fruitful use of Dyer's proposal in the near future will be in decreasing
the number of parameters considered important in raising school performance,
as measured on standardized achievement tests. On the other hand, the insights
gained from using predictive models in education cannot be converted into
"control models" because some of the important variables are not subject to
manipulation by educators or indeed the people in a free society. To
emphasize the tentative nature of the predictions made using our model, the
resulting interpretive report was described as an achievement forecast, not
an expectation.



Table 1. All Ten Variables as Predictors of Sixth Grade Reading
Achievement (1971/72)

Variable                      Multiple R   Correlation with Criterion

Percent Transfers Out            .66              .66
Pupil/Parent Factor              .88*             .80
Second Grade Reading             .91*             .79
Percent on Welfare               .92             -.77
Percent Absence                  .92*            -.61
Percent Transfers In             .93*            -.78
Pupil/Teacher Ratio              .93              .70
Number of Registered Voters      .93              .76
Number of Free Lunches           .93
Percent Voting                   .93             -.60



Table 2. The Five Predictor Model of Sixth Grade Reading Achievement
(1971/72)

Variable                      Multiple R   Correlation with Criterion

Second Grade Reading             .79*             .79
Percent on Welfare               .88*            -.77
Percent Transfers In             .90*            -.78
Pupil/Parent Factor              .90              .80
Percent Absence                  .90             -.61

*p < .05



Table 3. The Four Best Predictor Model of Sixth Grade Reading Achievement
(1971/72)

Variable                      Multiple R   Correlation with Criterion

Second Grade Reading             .79              .79
Percent on Welfare               .88             -.77
Percent Absence                  .89             -.61
Pupil/Parent Factor              .89              .80

*p < .05






Table 4. The Three Predictor Model of Sixth Grade Reading Achievement

Variable                      Multiple R        Correlation with Criterion

Second Grade Reading             .79*                  .79
Percent Transfers In             .8[illegible]*       -.78
Percent Absence                  .66*                 -.61

*p < .05



Table 5. Zero Order Correlations Between Residuals Within Year, 
Across Prediction Model (N^^^) 



Year 1970/71

                    One    Three    Four    Five

Achievement
Three Variable      .82
Four Variable       .83     .95
Five Variable       .77     .93

Year 1971/72

Achievement
Three Variable      .77
Four Variable       .82     .85
Five Variable       .73     .77      .90

Year 1972/73

Achievement
Three Variable      .91
Four Variable       .76     .81*
Five Variable       .70     .77      .92



Table 6. Zero Order Correlations Between Residuals Within Prediction
Model, Across Years (N=64)

Achievement Alone

          70/71          71/72    72/73

70/71
71/72     .56
72/73     [illegible]    .54

Three Variable Prediction Model

          70/71          71/72    72/73

70/71
71/72     .51
72/73     .41            .45

Four Variable Model

          70/71          71/72    72/73

70/71
71/72     .51
72/73     .42            .37

Five Variable Model

          70/71          71/72    72/73

70/71
71/72     .42
72/73     .39            .25



Table 7. Zero Order Correlations Between Residuals Obtained by Applying
the Best Three Variable Model (1971/72) to Data for All Three
Years (N=66)

          70/71          71/72

70/71
71/72     [illegible]
72/73     [illegible]    .44